


File: USPT 



Jun 24, 1986



DOCUMENT- IDENTIFIER: US 4597053 A 

TITLE: Two-pass multiplier/accumulator circuit 



Abstract Text (1) : 

A two-pass Multiplier/Accumulator Circuit is provided which performs various
arithmetic operations on operands contained within an X Register 10 (FIG. 1) and a
Y Register 20 and places the result in an Accumulator Register 40. The arithmetic
operations are carried out by passing the product of the operands successively
through an array of adders in the Adder unit 34. Each adder adds an appropriate
multiple of the contents of the X Register to the Accumulator 40 or to the output
of the previous adder. The multiples are selected according to the contents of the
Y Register.

Brief Summary Text (2) : 

This invention relates generally to digital logic circuits, and, in particular, to 
a digital two-pass multiplier/accumulator circuit.

Brief Summary Text (5) : 

In the digital arithmetic technology it is a known problem that digital 
multiplication and division require complex circuitry and/or complex routines to 
execute. In the manufacture of high capability, high quality electronic equipment, 
it is often necessary to provide high speed digital multiplication and divide 
capability. For such equipment to be economically competitive demands that the 
manufacturer keep the component costs relatively low. Thus there is a pressing need 
for a relatively low cost circuit for performing high speed arithmetic operations, 
including multiplication and division. 

Brief Summary Text (8) : 

Accordingly, it is an object of the present invention to provide an improved
Two-Pass Multiplier/Accumulator Circuit.

Brief Summary Text (9) : 

It is also an object of the present invention to provide a Two-Pass
Multiplier/Accumulator Circuit which performs its operations in a pipelined manner
to achieve high speed operation.

Brief Summary Text (10): 

It is a further object of the present invention to provide a Two-Pass 
Multiplier/Accumulator Circuit which is capable of determining the maximum or
minimum value in a sequence of digital numbers. 

Drawing Description Text (3) : 

FIG. 1 shows a block diagram illustrating a preferred embodiment of a Two-Pass
Multiplier/Accumulator Circuit of the present invention.

Drawing Description Text (4) :

FIG. 2A shows a timing diagram illustrating the clock timing for the Synchronous
Mode (Minimum Cycle Time) of the Multiplier/Accumulator of the present invention.

Drawing Description Text (5) :

FIG. 2B shows a timing diagram illustrating the clock timing for the Synchronous
Mode (Extended Cycle) of the Multiplier/Accumulator of the present invention.

Detailed Description Text (2) : 

The Multiplier-Accumulator-RAM circuit (MAR) contains a 16-bit
Multiplier/Accumulator (M/A) 1 and a 256-word by 16-bit Random-Access-Memory (RAM)
50 with control circuitry to execute an extensive set of arithmetic and logic
functions on data contained in the on-chip RAM or in external devices, as well as a
large variety of data transfers between the internal parts of the chip (M/A and
RAM) and other external devices. A functional block diagram of the MAR is shown in
FIG. 1. This diagram is functional only, and the actual implementation may vary.

Detailed Description Text (5) : 
MULTIPLIER/ACCUMULATOR/RAM COMPONENTS

Detailed Description Text (19) : 
Multiply Group 

Detailed Description Text (20) : 
Delayed Multiply Group 

Detailed Description Text (35) : 

Operation of Digital Multiplier/Accumulator

Detailed Description Text (36) : 
Operation of Pipelined Multiplier 

Detailed Description Text (38) : 
Multiplier/Accumulator/RAM Components

Detailed Description Text (39) : 

The Multiplier/Accumulator portion 1 of the circuit forms the arithmetic product of
two 16-bit numbers contained in registers X and Y, and adds or subtracts this
product to or from a 40-bit Accumulator 40 (AX, AH, AL). The M/A can also perform
the logical operations AND and EXCLUSIVE OR on the contents of X and AH or AL, and
it can shift the entire 40-bit Accumulator contents left or right for
normalization, or for multiplication or division by 2. Except for the logical
operations, all data are treated as signed numbers using two's-complement notation.
A software-controlled mode allows the data to be treated as integers or fractions.
The M/A is controlled by the F(0-4) function inputs when enabled by the proper
C(0-4) control inputs.

Detailed Description Text (43) : 

The 16-bit Y Register 20 holds the second operand in Sign Multiply and all ordinary 
Multiply operations. 

Detailed Description Text (47): 

The 5-bit input F(0-4) to Function Decode 70 provides 32 M/A functions, including
multiply with or without accumulation, add, subtract, compare, absolute value,
X*sign of Y, negate, divide, maximum, and minimum.

Detailed Description Text (51) : 

During one cycle of operation, the MAR decodes the Control and Function inputs, and 
if so directed, performs a data transfer and a function. In general, the C input 
controls the data transfer between the RAM, M/A, and DB, and the F input specifies 
the register in the M/A which is to be read or loaded and the function to be 
performed in the M/A 1. The function may consist of setting modes or setting or 
starting an operation. Operations take place during the cycle or cycles following 
the execution of the function, operating on the data in X, Y, and A, and usually 
placing the result in A. Except for the logical operations (AND and EOR), all data
are treated as fixed point, two's-complement signed numbers, in either integer or
fraction representation. The integer or fraction interpretation is controlled by a
mode bit. The most-significant bit of X, Y, or AX indicates the sign of X, Y, or
the entire 40-bit Accumulator. The bit-weighting of all registers is shown in Table 
1 for integer and fraction representation. 

Detailed Description Text (52) : 

The X and Y registers are fully buffered so that additional data transfers and 
functions may be performed while an operation is in progress in a "pipeline" 
fashion. For example, in doing a multiply-accumulate of the form
A = X1*Y1 + X2*Y2 + ..., two operands can be loaded while the
previous operands are being operated upon. This is described in greater detail
in the section below entitled "Pipelined Multiplier".

Detailed Description Text (58) : 

INTEGER/ FRACTION is set by the Load Mode Register function and specifies whether 
data are to be operated on as integers or fractions. In general, results of 
multiplication operations are shifted left one bit in FRAC mode, and single-word
operations use AH in FRAC mode or AL in INTG mode. 
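
The effect of the one-bit left shift in FRAC mode can be illustrated with a minimal Python sketch. It assumes that a 16-bit fraction is the usual signed layout of one sign bit followed by 15 fraction bits (Q15), so that the shifted product is a 31-bit fraction (Q31); the bit-weight table (Table 1) is not reproduced in this record, so that layout is an assumption. The function names are hypothetical and do not come from the patent.

```python
def multiply_intg(x: int, y: int) -> int:
    """Integer mode: the 16 x 16 product is used as is."""
    return x * y


def multiply_frac(x: int, y: int) -> int:
    """Fraction mode: shift the product left one bit so that two Q15
    operands give a Q31 result (assumed format, see the lead-in)."""
    return (x * y) << 1


half = 0x4000                          # 0.5 in Q15
product = multiply_frac(half, half)    # 0.5 * 0.5 as a Q31 value
assert product == 0x20000000           # 0.25 in Q31
assert product >> 16 == 0x2000         # high word holds 0.25 in Q15
```

Without the shift, the high word of the product of two fractions would carry a redundant sign bit, and a single-word result read from AH would have half the expected weight.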

Detailed Description Text (64) : 

ACCUMULATE is set by the Load Mode Register function. ACC affects only the Absolute 
Value and Multiply by sign of Y operations. When ACC=0, these operations clear the 
Accumulator and place the new results into it. When ACC=1, these operations add the
new results to the previous Accumulator contents. 

Detailed Description Text (76) : 

In addition to the data transfers stated above, if a Delayed Multiply or a Delayed 
Multiply-Round operation is pending, it will be started as follows:

Detailed Description Text (82) : 

In addition to the data transfers, if a Delayed Multiply-Accumulate operation is
pending, it will be started as follows: 

Detailed Description Text (91) : 

the mode change will be effective for the operation that follows DIF (MPY in this 
example). Note, however, that if the LDMR is delayed, such as by an interrupt, the
multiply will occur before the LDMR is executed, thus giving different results if 
the FRAC/INTG mode is changed. 



Detailed Description Text (103) : 
Multiply Group 

Detailed Description Text (104): 
MULTIPLY 



Detailed Description Text (106) : 
MULTIPLY-ROUND 



Detailed Description Text (107) : 

The product is formed as in MULTIPLY, but in FRAC mode it is added to the quantity
2^-16. The resulting value in AH is X*Y (or -X*Y) rounded to a one-word (16-bit)
value. That is, if the most-significant bit in AL due to a MULTIPLY operation
in FRAC mode would have been a 1, the MULTIPLY-ROUND operation will produce a value
in AH which is rounded to the next higher number than would have been produced by
the MULTIPLY. In INTG mode, MULTIPLY-ROUND is the same as MULTIPLY. This operation
takes two cycles.
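
The rounding rule can be sketched in a few lines of Python. The sketch assumes that AH:AL is read as a 32-bit fraction in which the least-significant bit of AH weighs 2^-15, so that 2^-16 corresponds to the most-significant bit of AL (0x8000 in the low word). That layout and the function names are assumptions made for illustration only.

```python
ROUND_CONSTANT = 0x8000  # 2**-16 when AH:AL is read as a Q31 fraction (assumed layout)


def multiply_frac_high(x: int, y: int) -> int:
    """AH word after a plain FRAC-mode MULTIPLY of two Q15 values."""
    return ((x * y) << 1) >> 16


def multiply_round_frac_high(x: int, y: int) -> int:
    """AH word after a FRAC-mode MULTIPLY-ROUND: add 2**-16, then keep AH."""
    return (((x * y) << 1) + ROUND_CONSTANT) >> 16


# 3 * 0.5 in Q15 units gives AH = 1, AL = 0x8000, so the top bit of AL is 1.
assert multiply_frac_high(3, 0x4000) == 1
assert multiply_round_frac_high(3, 0x4000) == 2   # rounded to the next higher value
# 2 * 0.5 gives AH = 1, AL = 0, so rounding leaves AH unchanged.
assert multiply_round_frac_high(2, 0x4000) == 1
```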



Detailed Description Text (108): 
MULTIPLY-ACCUMULATE 

Detailed Description Text (109) :

The product is formed as in MULTIPLY but it is added to the previous contents of 
the Accumulator. In FRAC mode, the least-significant bit of AL is ignored, and is 
forced to 0 in the result. A full 40-bit accumulation is performed. If an overflow
of AX occurs, the VX latch will be set to 1. 
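
A 40-bit accumulation with an overflow indication can be modelled as follows. This is a behavioural sketch only: it assumes the accumulator wraps around in two's complement when it overflows, which this record does not state, and it reports the overflow as a returned flag instead of modelling the VX latch itself. The names are hypothetical.

```python
ACC_BITS = 40
ACC_MIN = -(1 << (ACC_BITS - 1))
ACC_MAX = (1 << (ACC_BITS - 1)) - 1


def accumulate(acc: int, product: int) -> tuple[int, bool]:
    """Add product to a 40-bit two's-complement accumulator.

    Returns the wrapped result and a flag that is True when the exact
    sum does not fit in 40 bits (the condition that would set VX).
    """
    total = acc + product
    overflow = not (ACC_MIN <= total <= ACC_MAX)
    wrapped = (total - ACC_MIN) % (1 << ACC_BITS) + ACC_MIN
    return wrapped, overflow


assert accumulate(0, 7) == (7, False)
assert accumulate(ACC_MAX, 1) == (ACC_MIN, True)
```

The eight bits of AX above the 32-bit AH:AL pair give headroom for long sums of 32-bit products before an overflow of this kind can occur.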

Detailed Description Text (110) : 
Delayed Multiply Group 

Detailed Description Text (111) : 

DELAYED MULTIPLY -- DELAYED MULTIPLY-ROUND

Detailed Description Text (112) : 

These operations produce the same results as their counterparts. However, they are 
not started following the LDX function, but instead following a subsequent RDA 
function. This permits loading two operands while a previous MULTIPLY operation is 
in progress, then reading the previous result before starting the new operation. 
The function that starts these delayed operations is determined by the mode as 
follows : 

Detailed Description Text (116) : 
DELAYED MULTIPLY-ACCUMULATE

Detailed Description Text (123) : 

ADD and SUBTRACT are exactly equivalent to the MULTIPLY-ACCUMULATE operation with
Y=1 or -1, but are independent of the PLUS/MINUS mode. However, the contents of Y
are ignored and are not affected. The value in X (integer or fraction) is added to 
or subtracted from the 40-bit Accumulator. In FRAC mode, the least-significant bit 
of AL is ignored, and is forced to 0 in the result. If an overflow of AX occurs, 
the VX latch will be set. These operations take two cycles in INTG mode, one cycle 
in FRAC mode. 

Detailed Description Text (142): 

These operations multiply the contents of X by +1 or -1 and place the result in the 
Accumulator. If the accumulate mode is set (ACC=1) , ABS and MSY add the result to 
the previous Accumulator contents. The PLUS/MINUS mode affects only the MSY 
operation.

Detailed Description Text (145) : 

Multiply by Sign of Y (MSY): The contents of X are multiplied by +1 or -1 according
to the value in Y and the PLUS/MINUS mode as follows: 

Detailed Description Text (169) : 

The DIF operation is executed during the cycle immediately following the LDX 
function that specifies one of the above operations, including the delayed 
multiplies. Except for the delayed multiplies, the normal function occurs
immediately following DIF. The delayed multiplies are activated in the same manner 
independent of the DIF operation. At the time the delayed operation is activated, 
the DIF will have been completed. 

Detailed Description Text (195) : 

Timing of the MAR is controlled by two clock signals supplied to the circuit, the 
high-speed clock (HCLK) and the cycle clock (CCLK). There are two timing modes
determined by the AMODE input, synchronous mode (AMODE=low) and asynchronous mode
(AMODE=high). HCLK is a continuous clock which must be supplied regardless of the
mode. It runs at a multiple of the cycle frequency and is used to provide internal 
timing signals. CCLK is used to synchronize the internal timing to external events 
when necessary, and to initiate asynchronous internal functions. Data Transfer 
Acknowledge (DTA) is an output used in the asynchronous mode for handshaking with a 
host computer.

Detailed Description Text (200) : 

The minimum period of CCLK is three full cycles of HCLK, with a nominal 50% duty 
cycle. To extend the cycle, e.g., to transfer data to or from a slower device 
connected to the data bus, the low (active) phase of CCLK can be extended in 
multiples of one full cycle of HCLK. The start of the cycle may be delayed by 
extending the high phase of CCLK also in multiples of one full cycle of HCLK. 

Detailed Description Text (217) : 

Operation of Two-Pass Digital Multiplier/Accumulator Circuit

Detailed Description Text (219) :

The M/A multiplies the contents of the two operand data registers, X and Y, and 
adds the result (product) to the Accumulator A. Optionally, the Accumulator may be 
cleared to zero at the start of the operation so that the product of X and Y is 
obtained without accumulating the previous contents of A. 

Detailed Description Text (220) : 

The multiplication and addition to A are accomplished by passing the Accumulator 
contents twice through the array of Adders. The array comprises Adders 101-104 
(FIG. 3) in cascade. Each Adder adds an appropriate multiple of the contents of the 
X Register to the accumulator or to the output of the previous Adder. The multiples 
are selected according to the contents of the Y Register. Eight bits (Y') of Y are
selected during each pass thus producing a 24-bit result. During pass 1, the low- 
order 8 bits of Y are selected and the product is added to the low-order 24 bits of 
A. During pass 2, the high-order 8 bits of Y are selected and the product is added 
to the high order 24 bits of A. 
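
The arithmetic of the two passes can be checked with a short behavioural sketch in Python. It models only the numbers involved (a 16 x 8 partial product per pass), not the adder array or the 24-bit slices of the Accumulator, and it relies on the statement elsewhere in the description that the high-order byte of Y may be negative while the low-order byte is always treated as positive. The function names are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def two_pass_multiply_accumulate(x: int, y: int, acc: int = 0) -> int:
    """Return acc + x * y for signed 16-bit x and y, computed in two passes."""
    y_low = y & 0xFF                  # pass 1: low-order 8 bits, never negative
    y_high = to_signed(y >> 8, 8)     # pass 2: high-order 8 bits, carries the sign of Y
    acc += x * y_low                  # 24-bit partial product added to the low part of A
    acc += (x * y_high) << 8          # second partial product added 8 bits higher
    return acc


assert two_pass_multiply_accumulate(1234, -567) == 1234 * -567
assert two_pass_multiply_accumulate(-32768, -32768, acc=10) == 10 + (1 << 30)
```

Splitting Y this way works because y = 256 * y_high + y_low holds for every 16-bit two's-complement value when only the upper byte is sign-interpreted.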

Detailed Description Text (224): 

These multiples of X are all powers of two and thus require only a left shift
and/or inversion of X or forcing of zeros as an input to each Adder. 

Detailed Description Text (225) : 

All operands and results are considered as two's-complement signed numbers.
Therefore on pass 2, Y' may be negative whereas on pass 1 it is always considered
positive. The Z values are determined by adding the binary value 01010101 to Y' and
decoding as per Table 6. The powers of 4 (4, 16, 64) are obtained by shifting the 
contents of X left two bits at the input of each successive adder stage as shown in 
Table 6. 

Detailed Description Text (226) : 

Since left-shifting of X produces zeros in the low-order bit positions, Adder 
stages are not required in the low-order positions for the higher multiples of X, 
and they are not implemented. The high-order bits of the Adders (beyond 16 bits) 
are required only to handle the extended Sign of the selected multiple of X, and a 
possible ripple Carry. Two additional bits (for a total of 18 bits per Adder) can 
take care of this Sign and Carry with a circuit which provides inputs to the two 
high-order Adder positions. 

Detailed Description Text (231) : 

The use of a 2-bit-at-a-time algorithm in a parallel multiplier eliminates one-half 
of the required Adders at the expense of providing a like number of data selectors. 
In an NMOS circuit, data selectors are made of multiple transmission gates, occupy 
minimal space, and consume no DC power (except for drivers). The resulting array is
therefore considerably smaller and consumes less power than if a conventional 
parallel array were used. 

Detailed Description Text (232) : 

The number of Adders and data selectors is halved again by making two passes 
through the array to complete the multiply. This two-cycle multiply matches very
well with the two cycles required to load two operands via the single 16-bit Data 
Bus 15. As explained below, buffers are provided for the X and Y Registers so that 
new operands may be loaded while the previously-initiated multiplication is in 
progress. The penalty for two-cycle multiplication is additional timing logic 
required, which is much less than the logic saved by halving the Adder array. Three 
Carry bits must also be saved between passes, but again the required logic is 
minimal. 

Detailed Description Text (233) : 
Operation of Pipelined Multiplier 

Detailed Description Text (235) : 

The two operand buffer Registers X and Y permit two new operands to be loaded 
during the 2 cycles in which a multiply operation is taking place on the two 
operands which were loaded previously. A Function input specifies the register that
data is to be loaded or read from during each cycle, and it also specifies, in the
case of data to be loaded, the operation to be performed on that data during
subsequent cycles. The Control and Timing circuitry implement the loading,
reading, and multiplying.

Detailed Description Text (242) : 
Load X-Multiply

Detailed Description Text (243) :
Load X-Multiply-Accumulate

Detailed Description Text (244):

The Load X-Multiply and Load X-Multiply-Accumulate functions cause the product of X
and Y to be formed and added to the Accumulator. However, Load X-Multiply first
causes the Accumulator to be cleared to zero. 

Detailed Description Text (245) : 

These functions permit a pipelined multiply-accumulate as follows for the function:
Sum (r_i * s_i), i = 1 to n:

Detailed Description Text (247) : 

For a sequence of non-accumulating multiplies of the form:

Detailed Description Text (248) :

using only the functions described, six cycles are required per multiply, two of 
which are wait cycles. In a typical application, only 16-bits of precision are 
required in the result, so only one READ A needs to be performed, and 5 cycles are 
sufficient per multiply, two of which are WAITs. 

Detailed Description Text (249) : 

To improve the efficiency of this type of calculation, another function is added: 
LOAD X-DELAYED MULTIPLY. This function loads data into the X Register and
establishes the operation to be performed, but it does not initiate the multiply 
operation. The multiply operation is automatically initiated following a subsequent 
READ A. This function permits two operands to be loaded while a previous 
multiplication is in progress, but permits the previous result to be read before 
the new operation is performed as shown below. 

Detailed Description Text (250) : 

The total number of cycles is 3n+2, whereas 5n cycles would be required without the
DELAYED MULTIPLY function.
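
The two cycle counts can be reproduced by writing out the function sequences and counting one cycle per function. The sketch below follows the sequence tabulated for the delayed multiply (Table 18) and the five-cycle sequence described above for the case where only one READ A is needed; the Python function names are hypothetical, and each list entry is assumed to occupy exactly one cycle.

```python
def plain_schedule(n: int) -> list[str]:
    """n independent multiplies without the delayed function: 5 cycles each."""
    ops: list[str] = []
    for _ in range(n):
        ops += ["LOAD Y", "LOAD X-MULTIPLY", "WAIT", "WAIT", "READ A"]
    return ops


def delayed_schedule(n: int) -> list[str]:
    """n multiplies (n >= 1) using LOAD X-DELAYED MULTIPLY for all but the first."""
    ops = ["LOAD Y", "LOAD X-MULTIPLY"]
    for _ in range(n - 1):
        # The next pair is loaded while the previous multiply runs; READ A
        # returns the previous result and starts the delayed multiply.
        ops += ["LOAD Y", "LOAD X-DELAYED MULTIPLY", "READ A"]
    ops += ["WAIT", "WAIT", "READ A"]   # drain the last multiply
    return ops


n = 10
assert len(plain_schedule(n)) == 5 * n          # 50 cycles
assert len(delayed_schedule(n)) == 3 * n + 2    # 32 cycles
```

The saving comes from replacing the two WAIT cycles of every multiply except the last with the loads for the next operand pair.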

Detailed Description Text (252) : 

This function is LOAD X-DELAYED MULTIPLY-ACCUMULATE. In this case, the operation is
initiated by a LOAD A function rather than a READ A.

Detailed Description Text (254):

The total number of cycles is 4n+2 versus 6n without the DELAYED
MULTIPLY-ACCUMULATE function.

Detailed Description Text (276) : 

It will be apparent to those skilled in the art that the disclosed Two-Pass 
Multiplier/Accumulator Circuit may be modified in numerous ways and may assume many
embodiments other than the preferred form specifically set out and described above. 



Detailed Description Text (277) : 

For example, the Multiplier/Accumulator 1 can be implemented independently of the
RAM 50. 

Detailed Description Paragraph Table (5) : 

ADD, SUB all multiples 

if |Y| <= 127 INTG mode only ABS, MSY with ACC = 1



Detailed Description Paragraph Table (14) : 

Single Multiply:
  LDYP    Y1
  LDXMPY  X1
  NOOP                        Multiply operation in progress
  NOOP
  RDAH    X1*Y1 (MS)          Result (most-significant half)
  RDAL    X1*Y1 (LS)          Result (least-significant half)

Logical AND (INTG mode):
  LDAL    A1
  LDXAND  X1
  NOOP                        AND operation in progress
  RDAL    A1 AND X1           Result

Multiply-Round and ADD (FRAC mode):
  LDYM    Y1
  LDXMPR  X1
  NOOP                        Multiply operation in progress
  LDXADD  X2
  NOOP                        Add operation in progress
  RDAH    -(X1*Y1) + Y2       (Rounded to single word)

Square of Difference (INTG mode):
  LDYDIF  Y1
  LDXMPY  X1
  NOOP                        Difference operation in progress
  NOOP                        Multiply operation in progress
  NOOP
  RDAL    (Y1/2 - X1/2)^2     (Least-significant part)

Difference with third operand (FRAC mode):
  LDYDIF  Y1
  LDXDMP  X1
  LDXMPY  X2                  Difference operation (Difference -> Y only)
  NOOP                        Multiply operation in progress
  NOOP
  RDAH    (Y1/2 - X1/2)*2     (most-significant part)

Multiply-Accumulate:
  LDYP    Y1
  LDXMPY  X1
  LDYP    Y2                  First multiply in progress
  LDXMAC                      Second operand pair loaded
  .                           Second multiply in progress
  .
  LDYP    Yn
  LDXMAC  Xn
  NOOP                        Last multiply in progress
  NOOP
  RDAL
  RDAH                        Result = (X1*Y1) + (X2*Y2) + ... + (Xn*Yn)
  RDAX

Maximum Value (FRAC mode):
  LDAH    X0
  LDXMAX  X1
  NOOP                        First maximum operation in progress
  LDXMAX  X2
  NOOP
  LDXMAX  X3                  Second maximum operation in progress
  . . .
  LDXMAX  Xn
  NOOP                        Last maximum operation in progress
  NOOP
  RDAH    X(Max)              Result = maximum of (X0, X1, ..., Xn)
  RDAL    i                   Index: X(MAX) = Xi

Array multiplication (delayed multiply) (FRAC mode):
  LDYP    Y1
  LDXMPR  X1
  LDYP    Y2                  First multiply in progress
  LDXDMR  X2                  Load second operand pair
  RDAH (1) X1*Y1              Read first result
  LDYP    Y3                  Second multiply in progress
  LDXDMR  X3                  Load third operand pair
  RDAH (1) X2*Y2
  . . .
  LDYP    Yn                  Multiply in progress
  LDXDMR  Xn                  Load last operand pair
  RDAH (1) Xn-1*Yn-1          Read result
  NOOP                        Last multiply in progress
  NOOP
  RDAH (1) Xn*Yn              Read last result
  (1) If double-word result is desired, read AL first; RDAH starts
      delayed operation in FRAC mode.

Array multiply and Add (Z'i = Xi*Yi + Zi) (INTG mode):
  LDYP    Y1                  Load first operand pair
  LDXDMA  X1
  LDAL (1) Z1                 Load first update value
  LDYP    Y2                  First multiply-accumulate operation in progress
  LDXDMA  X2                  Load second operand pair
  RDAL    Z'1                 Read first result
  LDAL (1) Z2                 Load second update value
  LDYP    Y3                  Second multiply-accumulate operation in progress
  LDXDMA  X3
  RDAL    Z'2                 Read second result
  . . .
  LDAL (1) Zn
  NOOP                        Last multiply-accumulate operation in progress
  NOOP
  RDAL    Z'n                 Read last result
  (1) If double-precision operation is desired, load AH first; LDAL starts
      DMA operation in INTG mode.

Multiply and scale result (INTG mode):
  LDYP    Y1
  LDXMPY  X1
  NOOP                        Multiply operation in progress
  NOOP
  ASR                         Shift function
  ASR                         Shift function
  RDAL    (X1*Y1)/4           Result

Divide (Fractional data, FRAC mode):
  LDAH    A1                  (Most-significant)
  LDAL    A1                  (Least-significant)
  LDXSUB  X1
  LDXDIV  X1                  Subtract operation in progress
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP                        Divide operation in progress
  NOOP                        (16 cycles)
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  RDAL                        Quotient = (A1/X1)
  LDXADD  X1
  NOOP                        Add operation in progress
  RDAH                        Remainder

Divide (Integer data, FRAC mode):
  LDAH    A1                  (Most significant)
  LDAL    A1                  (Least significant)
  ASL                         Pre-shift divisor
  LDXSUB  X1
  LDXDIV  X1                  Subtract operation in progress
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP                        Divide operation in progress
  NOOP                        (16 cycles)
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  NOOP
  RDAL                        Quotient = (A1/X1)
  LDXADD
  NOOP                        Add operation in progress
  ASR
  RDAH                        Remainder



Detailed Description Paragraph Table (17): 

Function          Data    Operation in Progress

LOAD Y            r_1     None
LOAD X-MULTIPLY   s_1
LOAD Y            r_2     r_1 * s_1
LOAD X-MULT-ACC   s_2
LOAD Y            r_3     A + r_2 * s_2
LOAD X-MULT-ACC   s_3
LOAD Y            r_n     A + r_(n-1) * s_(n-1)
LOAD X-MULT-ACC   s_n
WAIT              --      A + r_n * s_n
WAIT              --
READ AH           (r * s)
READ AL



Detailed Description Paragraph Table (18) : 

Function            Data      Operation in Progress

LOAD Y              r_1       none
LOAD X-MULTIPLY     s_1
LOAD Y              r_2       r_1 * s_1
LOAD X-DEL. MULT.   s_2
READ A              t_1
LOAD Y              r_3       r_2 * s_2
LOAD X-DEL. MULT.   s_3
READ A              t_2
. .
LOAD Y              r_n       r_(n-1) * s_(n-1)
LOAD X-DEL. MULT.   s_n
READ A              t_(n-1)
WAIT                --        r_n * s_n
WAIT                --
READ A              t_n



Detailed Description Paragraph Table (23) : 

TABLE 2 __ 

M/A FUNCTIONS

43210  DIF  MNEMONIC  DATA TRANSFER      FUNCTION/OPERATION CONTROL

00000       RDAL      AL -> B            If INTG mode: start DELAYED
00001       RDAH      AH -> B            If FRAC mode: MULTIPLY or
00010       RDALL     AL (LIM) -> B (1)  If INTG mode: MULTIPLY-ROUND
00011       RDAHL     AH (LIM) -> B (1)  If FRAC mode:
00100       RDAX      AX -> B            None
00101       RDST      ST, MR -> B        None
00110       ASL       A (L1) -> A        Shift A (40 bits) left with 0 fill
00111       ASR       A (R1) -> A        or right with sign extension, 1 bit
01000  *    LDXMPY    B -> X             Start MULTIPLY
01001  *    LDXMPR    B -> X             Start MULTIPLY-ROUND
01010  *    LDXMAC    B -> X             Start MULTIPLY-ACCUMULATE
01011  *    LDXABS    B -> X             Start ABSOLUTE
01100       LDXADD    B -> X             Start ADD
01101       LDXSUB    B -> X             Start SUBTRACT
01110       LDAL      B -> AL (2)        If INTG mode: start DELAYED
01111       LDAH      B -> AH (3)        If FRAC mode: MULTIPLY-ACCUMULATE
10000       LDYPM     B -> Y             Set PLUS/MINUS mode (MREV = 0/1)
10001       LDYDIF    B -> Y             Set DIF
10010       LDYP      B -> Y             Set PLUS mode
10011       LDYM      B -> Y             Set MINUS mode
10100  *    LDXMAX    B -> X             Start MAXIMUM
10101  *    LDXMIN    B -> X             Start MINIMUM
10110       LDXAND    B -> X             Start LOGICAL AND
10111       LDMR      B -> MR            Set/Clear modes
11000  *    LDXDMP    B -> X             Set MULTIPLY-DELAYED
11001  *    LDXDMR    B -> X             Set MULTIPLY-ROUND DELAYED
11010  *    LDXDMA    B -> X             Set MULTIPLY-ACCUMULATE DELAYED
11011       LDXMSY    B -> X             Start SIGN MULTIPLY
11100  *    LDXCMP    B -> X             Start COMPARE
11101  *    LDXDIV    B -> X             Start DIVIDE
11110       LDXEOR    B -> X             Start LOGICAL EXCLUSIVE OR
11111       LDXNEG    B -> X             Start NEGATE

*   Start DIFFERENCE mode if pending.
(1) See infra for details of Read A Limited.
(2) If FIRST LOAD, propagate sign into AH & AX, then clear FIRST LOAD.
(3) Propagate sign into AX; if FIRST LOAD: 0 -> AL, then clear FIRST LOAD.
All LDX functions set FIRST LOAD and FIRST READ.

Detailed Description Paragraph Table (25) : 

TABLE 4 __ 

M/A OPERATIONS MATHEMATICAL # CYCLES OPERATION MNEMONIC CONDITION DESCRIPTION FRAC 

INTG 

DIFFERENCE DIF (Y/2 - X/2) .fwdarw. Y and 1 1 MULTIPLY MPY, DMP .+-. X * Y .fwdarw. 
A 2 2 MULTIPLY -ROUND MPR, DMR FRAC 2. sup. -16 .+-. X * Y .fwdarw. A 2 — INTG .+-. X 
* Y .fwdarw. A -- 2 MULTIPLY - MAC, DMA A .+-. X * Y .fwdarw. A 2 2 ACCUMULATE ADD 
ADD A + X .fwdarw. A 1 2 SUBTRACT SUB A - X .fwdarw. A 1 2 COMPARE CMP MAG = 0 Am - 
X ; sign .fwdarw. S 1 1 MAG = 1 . vertline .Am. vert line . - . vertline .X . vertline . ; 
sign .fwdarw. S 1 1 AX + 1 .fwdarw. AX: [COMPARE] MAXIMUM MAX 2 2 if S = 1: 
X .fwdarw. Am, AX .fwdarw. An AX + 1 .fwdarw. AX: [COMPARE] MINIMUM MIN 2 2 if S = 
0: X .fwdarw. Am, AX .fwdarw. An NEGATE NEG -X .fwdarw. All ABSOLUTE ABS ACC = 

0 . vertline. X. vertline. .fwdarw. All ACC = 1 A + . vertline . X . vertline . .fwdarw. A 

1 2 SIGN MULTIPLY MSY ACC = 0 .+-.X * sign(Y) .fwdarw. All ACC = 1 A .+-.X * sign 
(Y) .fwdarw. A 1 2 LOGICAL AND AND X .multidot. Am .fwdarw. Am; 0 .fwdarw. An & AX 

1 1 EXCLUSIVE OR EOR X .sym. Am .fwdarw. Am; 0 .fwdarw. An & IX 1 AH (MSB) .fwdarw. 
M DIVIDE DIV FRAC 2 * A .+-. X + — M * 2. sup. -31 .fwdarw. 16 — (16 times) INTG 
Undefined -- 16 

.+-.: 

Except DIVIDE: + in PLUS mode, - in MINUS mode. DIVIDE: + if M = 1, - if M = 0 Am: 
AH in FRAC mode, AL in INTG mode An: AL in FRAC mode, AH in INTG mode 

CLAIMS : 

1. A digital multiplier/accumulator circuit for generating the product of first and
second binary numbers, said circuit comprising: 

an M-bit multiplicand register for storing said first binary number, said first 
binary number having M or fewer bits; 

an N-bit multiplier register for storing said second binary number, said second 
binary number having N or fewer bits, said multiplier register being divided into 
first and second equal portions; 

control logic for controlling the operation of said multiplier circuit, said 
control logic generating at least one control signal in response to one of a 
plurality of commands applied thereto; 

an array of P multiplier decoders, where P=N/4, for decoding a multiplier operand 
which is stored in either said first portion or said second portion of said 
multiplier register, and for generating either a first plurality of multiplier 
decoder outputs or a second plurality of multiplier decoder outputs in response to 
first and second control signals, respectively, generated successively by said 
control logic; 

the first of said decoders decoding the 2 least significant bits of said multiplier 
operand and generating 1 of 4 possible first decoder outputs representing 
multiplication of said multiplicand by the factors 0, +1, -1, or 2, respectively; 

a second of said decoders decoding the next 2 higher significant bits of said 
multiplier operand, and generating 1 of 4 possible second decoder outputs 
representing multiplication of said multiplicand by the factors 0, +2, -2, or 4, 
respectively; 

each successive decoder decoding the next 2 higher significant bits of said 
multiplier operand, with the Pth of said decoders decoding the 2 most significant 
bits of said multiplier operand and generating 1 of 4 possible Pth decoder outputs 
representing multiplication of said multiplicand by the factors 0, +2^(2(P-1)),
or -2^(2(P-1)), or 2^(2P-1), respectively, where P represents the Pth decoder;

an accumulator register having 2(M+P) bit positions for storing results of the
operations of said digital multiplier/accumulator circuit;

an array of P full adders, each being (M+2) bits in length, 

said control logic being responsive to said first control signal for causing the 
least significant M stages of said first adder to be responsive both to the
contents of corresponding bits of said multiplicand register and to the output of 
said first multiplier decoder, and to generate a partial sum equal to the product 
of said first binary number and said first multiplier decoder output, the two least 
significant bits of said partial sum being stored in the two least significant bit 
positions of said accumulator; 

the least significant M stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the M most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the contents of the M most significant stages of the 
previous adder and the product of said first binary number multiplied by said 
decoder output, the two least significant bits of each successive partial sum being 
stored in successively higher significant bit positions of said accumulator, the 
entire (M+2) bits of the Pth partial sum also being stored in bit positions of said 
accumulator which are adjacent to and successively higher than said two least 
significant bits of the (P-1)th partial sum;

said control logic being responsive to said second control signal for conducting
the (M+1)th through (2M+2)th bits of said accumulator to the 1st through (M+2)th
stages, respectively, of said first adder and enabling the least significant M
stages of said first adder to be responsive to the contents of corresponding bits
of said multiplicand register, to the contents of said (M+1)th through (2M)th bits
of said accumulator, and to said first multiplier decoder generating one of said
second plurality of multiplier decoder outputs, and to thereby generate a partial
sum equal to the sum of said (M+1)th through (2M)th bits of said accumulator and
the product of said first binary number and said first multiplier decoder output,
the two least significant bits of said partial sum being stored in the (M+1)th and
(M+2)th bit positions of said accumulator;

the (M+l)th and (M+2)th stages of said first adder being responsive to the contents 
of the (2M+l)th and (2M+2)th bits, respectively, of said accumulator and generating 
a partial sum; 

the two most significant stages of the Pth successive adder being responsive to the 
contents of the (2(M+P)-1)th and 2(M+P)th bits, respectively, of said accumulator 
and generating a partial sum; 

the least significant M stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the M most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the contents of the M most significant stages of the 
previous adder and the product of said first binary number multiplied by said 
decoder output, the two least significant bits of each successive partial sum being 
stored in successively higher significant bit positions of said accumulator, the 
entire M+2 bits of the Pth partial sum also being stored in bit positions of said 
accumulator which are adjacent to and successively higher than said two least 
significant bits of the (P-1)th partial sum; 

whereby after the successive generation of said first and second control signals, 
the product of said first and second binary numbers is stored in said accumulator 
register. 

2. The digital multiplier/accumulator circuit as recited in claim 1 and further 
comprising: 

an array of P sign/carry circuits, one being associated with each of said full 
adders, each sign/carry circuit generating a sign signal and a carry signal, the 
first sign/carry circuit being responsive to sign and carry signals provided by 
said accumulator, each successive sign/carry circuit being responsive to a sign 
signal provided by the previous sign/carry circuit and to a carry signal provided 
by the most significant stage of the previous full adder. 

3. A digital multiplier/accumulator circuit for generating the product of first and 
second binary numbers, said binary numbers each having 16 or fewer bits, said 
circuit comprising: 

a 16-bit multiplicand register for storing said first binary number; 

a 16-bit multiplier register for storing said second binary number, said multiplier 
register being divided into first and second 8-bit portions; 

control logic for controlling the operation of said multiplier circuit, said 
control logic generating at least one control signal in response to one of a 
plurality of commands applied thereto; 

an array of 4 multiplier decoders for decoding a multiplier operand which is stored 
in either said first portion or said second portion of said multiplier register, 
and for generating either a first plurality of multiplier decoder outputs or a 
second plurality of multiplier decoder outputs in response to first and second 
control signals, respectively, generated successively by said control logic; 

the first of said decoders decoding the 2 least significant bits of said multiplier 
operand and generating 1 of 4 possible first decoder outputs representing 
multiplication of said multiplicand by the factors 0, +1, -1, or 2, respectively; 

a second of said decoders decoding the next 2 higher significant bits of said 
multiplier operand, and generating 1 of 4 possible second decoder outputs 
representing multiplication of said multiplicand by the factors 0, +2, -2, or 4, 
respectively; 

each successive decoder decoding the next 2 higher significant bits of said 
multiplier operand, with the 4th of said decoders decoding the 2 most significant 
bits of said multiplier operand, the Pth of said decoders generating 1 of 4 
possible fourth decoder outputs representing multiplication of said multiplicand by 
the factors 0, +2.sup.2(P-1), -2.sup.2(P-1), or 2.sup.2P-1, respectively, where P 
represents the Pth decoder; 

an accumulator register having 40 bit positions for storing results of the 
operations of said digital multiplier/accumulator circuit; 

an array of 4 full adders, each being 18 bits in length; 

said control logic being responsive to said first control signal for causing the 
least significant 16 stages of said first adder to be responsive both to the 
contents of corresponding bits of said multiplicand register and to the output of 
said first multiplier decoder, and to generate a partial sum equal to the product 
of said first binary number and said first multiplier decoder output, the two least 
significant bits of said partial sum being stored in the two least significant bit 
positions of said accumulator; 

the least significant 16 stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the 16 most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the contents of the 16 most significant stages of the 
previous adder and the product of said first binary number multiplied by said 
decoder output, the two least significant bits of each successive partial sum being 
stored in successively higher significant bit positions of said accumulator, the 
entire 18 bits of the fourth partial sum also being stored in bit positions of said 
accumulator which are adjacent to and successively higher than said two least 
significant bits of the third partial sum; 

said control logic being responsive to said second control signal for conducting 
the 17th through 34th bits of said accumulator to the 1st through 18th stages, 
respectively, of said first adder, and enabling the least significant 16 stages of 
said first adder to be responsive to the contents of corresponding bits of said 
multiplicand register, to the contents of said 17th through 32nd bits of said 
accumulator, and to said first multiplier decoder generating one of said second 
plurality of multiplier decoder outputs, and to thereby generate a partial sum 
equal to the sum of said 17th through 32nd bits of said accumulator and the product 
of said first binary number and said first multiplier decoder output, the two least 
significant bits of said partial sum being stored in the 17th and 18th bit 
positions of said accumulator; 

the 17th and 18th stages of said first adder being responsive to the contents of 
the 33rd and 34th bits, respectively, of said accumulator and generating a partial 
sum; 

the two most significant stages of the Pth successive adder being responsive to the 
contents of the (2(M+P)-1)th and 2(M+P)th bits, respectively, of said accumulator 
and generating a partial sum; 

the least significant 16 stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the 16 most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the 16 most significant bits of the partial product 
generated by the previous adder and the product of said first binary number 
multiplied by said decoder output, the two least significant bits of each 
successive partial sum being stored in successively higher significant bit 
positions of said accumulator, the entire 18 bits of the 4th partial sum also being 
stored in bit positions of said accumulator which are adjacent to and successively 
higher than said two least significant bits of the 3rd partial sum; 

whereby after the successive generation of said first and second control signals, 
the product of said first and second binary numbers is stored in said accumulator 
register. 
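
The two-pass arithmetic recited in claim 3 can be checked with a short software model. The following is a minimal sketch under stated assumptions, not the patented circuit: the function name `two_pass_multiply` and its parameters are hypothetical, the operands are treated as unsigned, each stage uses a plain radix-4 digit (0 to 3) in place of the signed decoder factors recited in the claim, and the accumulator bit positions are those of the simulation rather than the positions recited above. What it does keep is the overall shape: two passes over 8-bit halves of the multiplier, four add stages per pass, two product bits retired per stage, and the upper accumulator bits fed back into the first adder on the second pass.

```python
def two_pass_multiply(x: int, y: int, m: int = 16, n: int = 16) -> int:
    """Multiply unsigned x (m bits) by unsigned y (n bits) in two passes.

    Assumption: plain radix-4 digits 0..3 stand in for the claim's signed
    decoder outputs; bit positions are the simulation's own.
    """
    assert 0 <= x < (1 << m) and 0 <= y < (1 << n)
    half = n // 2        # multiplier bits consumed per pass (8 for n = 16)
    stages = half // 2   # add stages per pass, two multiplier bits each (4)
    acc = 0              # models the accumulator register

    for pass_index in range(2):
        base = pass_index * half                    # lowest bit written this pass
        operand = (y >> base) & ((1 << half) - 1)   # current half of the multiplier
        partial = acc >> base                       # upper bits fed back to adder 1
        acc &= (1 << base) - 1                      # keep product bits already retired

        for stage in range(stages):
            digit = (operand >> (2 * stage)) & 0b11          # radix-4 digit, 0..3
            partial += x * digit                             # add multiple of multiplicand
            acc |= (partial & 0b11) << (base + 2 * stage)    # retire two low bits
            partial >>= 2                                    # pass the rest upward

        acc |= partial << (base + half)   # store the remainder of the last partial sum

    return acc


if __name__ == "__main__":
    print(hex(two_pass_multiply(0x1234, 0xABCD)))
```

Retiring two bits per stage is what lets each adder stay only two bits wider than the multiplicand; the model reproduces that by shifting `partial` right by two after every addition.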

4. The digital multiplier/accumulator circuit as recited in claim 3 and further 
comprising: 

an array of 4 sign/carry circuits, one being associated with each of said full 
adders, each sign/carry circuit generating a sign signal and a carry signal, the 
first sign/carry circuit being responsive to sign and carry signals provided by 
said accumulator, each successive sign/carry circuit being responsive to a sign 
signal provided by the previous sign/carry circuit and to a carry signal provided 
by the most significant stage of the previous full adder. 

5. A method for multiplying first and second binary numbers, said method 
comprising: 

providing an M-bit multiplicand register for storing said first binary number, said 
first binary number having M or fewer bits; 

providing an N-bit multiplier register for storing said second binary number, said 
second binary number having N or fewer bits, said multiplier register being divided 
into first and second equal portions; 

providing an array of P multiplier decoders, where P=N/4; 

employing said array of multiplier decoders to decode a multiplier operand which is 
stored in said first portion of said multiplier register, and generating a first 
plurality of multiplier decoder outputs, 

the first of said decoders decoding the 2 least significant bits of said multiplier 
operand and generating 1 of 4 possible first decoder outputs representing 
multiplication of said multiplicand by the factors 0, +1, -1, or 2, respectively; 

a second of said decoders decoding the next 2 higher significant bits of said 
multiplier operand, and generating 1 of 4 possible second decoder outputs 
representing multiplication of said multiplicand by the factors 0, +2, -2, or 4, 
respectively; 

each successive decoder decoding the next 2 higher significant bits of said 
multiplier operand, with the Pth of said decoders decoding the 2 most significant 
bits of said multiplier operand and generating 1 of 4 possible Pth decoder outputs 
representing multiplication of said multiplicand by the factors 0, +2.sup.2(P-1), 
-2.sup.2(P-1), or 2.sup.2P-1, respectively, where P represents the Pth decoder; 

providing an accumulator register having 2(M+P) bit positions for storing results 
of the operations of said digital multiplier/accumulator circuit; 

providing an array of P full adders, each being (M+2) bits in length; 

causing the least significant M stages of said first adder to be responsive both to 
the contents of corresponding bits of said multiplicand register and to the output 
of said first multiplier decoder, and to generate a partial sum equal to the 
product of said first binary number and said first multiplier decoder output, the 
two least significant bits of said partial sum being stored in the two least 
significant bit positions of said accumulator; 

the least significant M stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the M most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the contents of the M most significant stages of the 
previous adder and the product of said first binary number multiplied by said 
decoder output, the two least significant bits of each successive partial sum being 
stored in successively higher significant bit positions of said accumulator, the 
entire (M+2) bits of the Pth partial sum also being stored in bit positions of said 
accumulator which are adjacent to and successively higher than said two least 
significant bits of the (P-1)th partial sum; 

conducting the (M+1)th through (2M+2)th bits of said accumulator to the 1st through 
(M+2)th stages, respectively, of said first adder, and enabling the least 
significant M stages of said first adder to be responsive to the contents of 
corresponding bits of said multiplicand register, to the contents of said (M+1)th 
through (2M)th bits of said accumulator, and to said first multiplier decoder 
generating one of said second plurality of multiplier decoder outputs, and to 
thereby generate a partial sum equal to the sum of said (M+1)th through (2M)th bits 
of said accumulator and the product of said first binary number and said first 
multiplier decoder output, the two least significant bits of said partial sum being 
stored in the (M+1)th and (M+2)th bit positions of said accumulator; 

the (M+1)th and (M+2)th stages of said first adder being responsive to the contents 
of the (2M+1)th and (2M+2)th bits, respectively, of said accumulator and generating 
a partial sum; 

the two most significant stages of the Pth successive adder being responsive to the 
contents of the (2(M+P)-1)th and 2(M+P)th bits, respectively, of said accumulator 
and generating a partial sum; 

the least significant M stages of each successive adder being responsive to the 
contents of corresponding bits of said multiplicand register, to the M most 
significant bits of the partial product generated by the previous adder, and to the 
corresponding successive multiplier decoder output, and each generating a partial 
sum equal to the sum of the contents of the M most significant stages of the 
previous adder and the product of said first binary number multiplied by said 
decoder output, the two least significant bits of each successive partial sum being 
stored in successively higher significant bit positions of said accumulator, the 
entire M+2 bits of the Pth partial sum also being stored in bit positions of said 
accumulator which are adjacent to and successively higher than said two least 
significant bits of the (P-1)th partial sum; 

whereby after two successive multiplications of said first binary number by said 
first and second portions of said second binary number, respectively, the product 
of said first and second binary numbers is stored in said accumulator register. 
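
The first-decoder factors 0, +1, -1 and 2 recited above cover only three non-negative values, so a 2-bit field equal to 3 has to be expressed some other way. One common reading, assumed here and not confirmed by the claim text, is a signed-digit recoding in which 3 is emitted as -1 with a carry of one into the next 2-bit field. The sketch below illustrates that reading only; the function name `recode_signed_radix4` is hypothetical, the factors it returns are unscaled per-stage values, and it does not model how the circuit's decoders or sign/carry circuits weight or propagate them.

```python
def recode_signed_radix4(operand: int, stages: int = 4):
    """Recode an unsigned operand into per-stage factors drawn from {0, +1, -1, 2}.

    Assumption: a field value of 3 is emitted as -1 plus a carry into the
    next stage. Returns (factors, carry_out), least significant stage first.
    """
    assert 0 <= operand < (1 << (2 * stages))
    factors = []
    carry = 0
    for stage in range(stages):
        value = ((operand >> (2 * stage)) & 0b11) + carry   # 0..4
        if value == 3:
            factor, carry = -1, 1     # 3 = 4 - 1
        elif value == 4:
            factor, carry = 0, 1      # 4 = 4 + 0
        else:
            factor, carry = value, 0  # 0, 1 or 2 pass through unchanged
        factors.append(factor)
    return factors, carry


def recoded_value(factors, carry_out):
    """Rebuild the operand from the recoded factors, as a consistency check."""
    total = sum(f * 4 ** i for i, f in enumerate(factors))
    return total + carry_out * 4 ** len(factors)


if __name__ == "__main__":
    print(recode_signed_radix4(0b11))
```

Under this reading, an 8-bit multiplier half needs four factors plus one carry-out, and `recoded_value` confirms that the recoding preserves the operand for every 8-bit input.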